Convolutional Neural Networks (CNN)


By Prof. Seungchul Lee
http://iai.postech.ac.kr/
Industrial AI Lab at POSTECH

Table of Contents

1. Convolution

1.1. 1D Convolution
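As a quick illustration (not part of the original lab code), 1D convolution of a short signal with a length-3 moving-average kernel can be sketched in NumPy; the signal values are purely illustrative:

```python
import numpy as np

# Illustrative sketch: 1D convolution with a moving-average kernel
x = np.array([1., 2., 3., 4., 5.])
k = np.ones(3) / 3          # length-3 averaging kernel

# 'valid' keeps only positions where the kernel fully overlaps the signal
y = np.convolve(x, k, mode='valid')
print(y)
```

Each output sample is a weighted sum of three neighboring input samples, which is exactly the smoothing effect described below for images.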


1.2. Convolution on Image (= Convolution in 2D)

Filter (or Kernel)

  • Modify or enhance an image by filtering
  • Filter images to emphasize certain features or remove others
  • Filtering includes smoothing, sharpening, and edge enhancement

  • Discrete convolution can be viewed as element-wise multiplication by a matrix
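To make this element-wise view concrete, here is a minimal hand-rolled 2D convolution (strictly, the cross-correlation that deep learning libraries compute); the toy image and kernel values are illustrative:

```python
import numpy as np

def conv2d(img, kernel):
    # At each location: element-wise multiply the kernel with the
    # underlying image patch, then sum (no flipping, no padding, stride 1)
    kh, kw = kernel.shape
    oh, ow = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

img = np.arange(16.).reshape(4, 4)      # toy 4x4 "image"
edge = np.array([[1., -1.]])            # horizontal difference kernel
out = conv2d(img, edge)
print(out)
```

Because the toy image brightens by exactly 1 per column, the horizontal difference kernel produces a constant response, illustrating how a kernel picks out one specific feature.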


How to find the right kernels

  • We want to learn many different kernels, each producing a specific effect on an image

  • Let's apply the opposite approach

  • We are not designing the kernel, but are learning the kernel from data

  • i.e., we can learn a feature extractor from data using a deep learning framework

2. Convolutional Neural Networks (CNN)

2.1. Motivation: Learning Visual Features


A bird occupies a local area and looks the same in different parts of an image. We should construct neural networks that exploit these properties.



  • ANN structure for object detection in images

    • does not seem to be the best choice
    • does not make use of the fact that we are dealing with images
    • spatial organization of the input is destroyed by flattening



  • Locality: objects tend to have local spatial support
    • fully connected layer $\rightarrow$ locally connected layer



  • __Translation invariance__: object appearance is independent of location
    • Weight sharing: units connected to different locations have the same weights
  • We are not designing the kernel, but are learning the kernel from data
    • _i.e._, we are learning a visual feature extractor from data

2.2. Convolutional Operator

Convolution of CNN

  • Local connectivity
  • Weight sharing
  • Typically have sparse interactions

  • Convolutional Neural Networks

    • Simply neural networks that use convolution in place of general matrix multiplication in at least one of their layers
  • Multiple channels


  • Multiple kernels
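A quick parameter count shows how channels and kernels interact; the sizes below match the first Conv2D layer of the MNIST lab (32 kernels of size 3×3 on a single grayscale channel):

```python
# Each of the k kernels spans all c_in input channels, plus one bias per kernel
h, w = 3, 3       # kernel height and width
c_in, k = 1, 32   # input channels (grayscale MNIST), number of kernels

weights = h * w * c_in * k   # one (h, w, c_in) weight block per output channel
biases = k
print(weights + biases)      # trainable parameters in this layer
```

With multiple input channels the kernel extends across all of them; with multiple kernels the layer produces one output channel per kernel.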


2.3. Stride and Padding

  • Strides: increment step size for the convolution operator
    • Reduces the size of the output map
  • No stride and no padding


  • Stride example with kernel size $3\times3$ and a stride of 2


  • Padding: artificially fill borders of image
    • Useful to keep spatial dimension constant across filters
    • Useful with strides and large receptive fields
    • Usually fill with 0s
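The resulting output size follows the usual formula $\lfloor (n + 2p - k)/s \rfloor + 1$ for input size $n$, kernel size $k$, stride $s$, and padding $p$; a small sketch:

```python
import math

def conv_out_size(n, k, s=1, p=0):
    # floor((n + 2p - k) / s) + 1: the standard convolution output-size formula
    return math.floor((n + 2 * p - k) / s) + 1

print(conv_out_size(7, 3))            # no stride (s=1), no padding
print(conv_out_size(7, 3, s=2))       # stride 2 shrinks the output map
print(conv_out_size(28, 3, p=1))      # padding of 1 keeps a 28-pixel input at 28
```

The last case is what `padding = 'SAME'` does in the lab code below: it pads just enough that the spatial size is preserved.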


2.4. Nonlinear Activation Function
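The standard choice in CNNs (and in the lab code below) is ReLU, applied element-wise to each feature map; a minimal sketch with illustrative values:

```python
import numpy as np

# ReLU zeroes out negative activations element-wise, injecting the
# nonlinearity between convolution layers
z = np.array([[-1.0, 2.0],
              [ 0.5, -3.0]])
relu = np.maximum(z, 0.0)
print(relu)
```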


2.5. Pooling

  • Compute the maximum value in a sliding window (max pooling)
    • Reduces spatial resolution for faster computation
    • Achieves invariance to any permutation inside a pooling cell


  • Pooling size: $2\times2$, for example
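A sketch of $2\times2$ max pooling with stride 2 on a toy feature map (values illustrative):

```python
import numpy as np

x = np.array([[1., 3., 2., 4.],
              [5., 6., 7., 8.],
              [3., 2., 1., 0.],
              [1., 2., 3., 4.]])

# Split the 4x4 map into 2x2 cells, then take the maximum of each cell
pooled = x.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)
```

Each output entry is the maximum of one $2\times2$ cell, so the spatial resolution is halved in both directions.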

2.6. CNN for Classification

  • CONV and POOL layers output high-level features of input
  • Fully connected layer uses these features for classifying input image
  • Express output as probability of image belonging to a particular class
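The final dense layer uses a softmax to turn raw class scores into those probabilities; a minimal sketch with illustrative scores:

```python
import numpy as np

# Softmax converts logits into probabilities that sum to 1
logits = np.array([1.0, 2.0, 0.5])            # illustrative scores for 3 classes
probs = np.exp(logits) / np.exp(logits).sum()
print(probs)
print(probs.argmax())                         # index of the predicted class
```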



3. Lab: CNN with TensorFlow (MNIST)

  • MNIST example
  • To classify handwritten digits



3.1. Training

In [15]:
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
In [2]:
mnist = tf.keras.datasets.mnist

(train_x, train_y), (test_x, test_y) = mnist.load_data()

train_x, test_x = train_x/255.0, test_x/255.0
train_x = train_x.reshape((train_x.shape[0], 28, 28, 1))
test_x = test_x.reshape((test_x.shape[0], 28, 28, 1))
In [3]:
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(32, 
                           (3,3), 
                           activation = 'relu',
                           padding = 'SAME',
                           input_shape = (28, 28, 1)),
    
    tf.keras.layers.MaxPool2D((2,2)),
    
    tf.keras.layers.Conv2D(64,
                           (3,3),
                           activation = 'relu',
                           padding = 'SAME'),
    
    tf.keras.layers.MaxPool2D((2,2)),
    
    tf.keras.layers.Flatten(),
    
    tf.keras.layers.Dense(128, activation = 'relu'),
    
    tf.keras.layers.Dense(10, activation = 'softmax')
])
In [4]:
model.compile(optimizer = 'adam',
              loss = 'sparse_categorical_crossentropy',
              metrics = ['accuracy'])
In [5]:
model.fit(train_x, train_y, epochs = 3)
Epoch 1/3
60000/60000 [==============================] - 29s 486us/sample - loss: 0.1198 - acc: 0.9630
Epoch 2/3
60000/60000 [==============================] - 27s 457us/sample - loss: 0.0398 - acc: 0.9879
Epoch 3/3
60000/60000 [==============================] - 29s 476us/sample - loss: 0.0263 - acc: 0.9917
Out[5]:
<tensorflow.python.keras.callbacks.History at 0x1779bdb6f98>

3.2. Testing or Evaluating

In [6]:
test_loss, test_acc = model.evaluate(test_x, test_y)

print('loss = {}, Accuracy = {} %'.format(round(test_loss,2), round(test_acc*100)))
10000/10000 [==============================] - 1s 129us/sample - loss: 0.0322 - acc: 0.9896
loss = 0.03, Accuracy = 99.0 %
In [7]:
test_img = test_x[np.random.choice(test_x.shape[0], 1)]

predict = model.predict_on_batch(test_img)
mypred = np.argmax(predict, axis = 1)

plt.figure(figsize = (12,5))

plt.subplot(1,2,1)
plt.imshow(test_img.reshape(28, 28), 'gray')
plt.axis('off')
plt.subplot(1,2,2)
plt.stem(predict[0], use_line_collection = True)
plt.show()

print('Prediction : {}'.format(mypred[0]))
Prediction : 3

4. Lab: CNN with TensorFlow (Steel Surface Defects)

  • NEU steel surface defects example
  • To classify defects images into 6 classes



Download NEU steel surface defects images and labels

4.1. Training

In [16]:
train_x, train_y = np.load('./data_files/NEU_train_imgs.npy'), np.load('./data_files/NEU_train_labels.npy')
test_x, test_y = np.load('./data_files/NEU_test_imgs.npy'), np.load('./data_files/NEU_test_labels.npy')

train_x, test_x = train_x/255.0, test_x/255.0
train_x = train_x.reshape((train_x.shape[0], 200, 200, 1))
test_x = test_x.reshape((test_x.shape[0], 200, 200, 1))
In [17]:
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(32, 
                           (3,3), 
                           activation = 'relu',
                           padding = 'SAME',
                           input_shape = (200, 200, 1)),
    
    tf.keras.layers.MaxPool2D((2,2)),
    
    tf.keras.layers.Conv2D(64,
                           (3,3),
                           activation = 'relu',
                           padding = 'SAME'),
    
    tf.keras.layers.MaxPool2D((2,2)),
    
    tf.keras.layers.Conv2D(128,
                           (3,3),
                           activation = 'relu',
                           padding = 'SAME'),
    
    tf.keras.layers.MaxPool2D((2,2)),
    
    tf.keras.layers.Flatten(),    
    tf.keras.layers.Dense(128, activation = 'relu'),    
    tf.keras.layers.Dense(6, activation = 'softmax')
])
In [18]:
model.compile(optimizer = 'adam',
              loss = 'sparse_categorical_crossentropy',
              metrics = ['accuracy'])
In [19]:
model.fit(train_x, train_y, epochs = 4)
Epoch 1/4
1500/1500 [==============================] - 37s 25ms/sample - loss: 1.6960 - acc: 0.2767
Epoch 2/4
1500/1500 [==============================] - 38s 25ms/sample - loss: 0.9103 - acc: 0.6440
Epoch 3/4
1500/1500 [==============================] - 39s 26ms/sample - loss: 0.4528 - acc: 0.8427
Epoch 4/4
1500/1500 [==============================] - 41s 28ms/sample - loss: 0.2718 - acc: 0.9073
Out[19]:
<tensorflow.python.keras.callbacks.History at 0x1779d652748>

4.2. Testing or Evaluating

In [21]:
test_loss, test_acc = model.evaluate(test_x, test_y)

print('loss = {}, Accuracy = {} %'.format(round(test_loss,2), round(test_acc*100)))
300/300 [==============================] - 2s 7ms/sample - loss: 0.2848 - acc: 0.9000
loss = 0.28, Accuracy = 90.0 %
In [23]:
name = ['scratches', 'rolled-in scale', 'pitted surface', 'patches', 'inclusion', 'crazing']

idx = np.random.choice(test_x.shape[0], 1)
test_img = test_x[idx]
GT = test_y[idx]

predict = model.predict_on_batch(test_img)
mypred = np.argmax(predict, axis = 1)

plt.figure(figsize = (12,5))

plt.subplot(1,2,1)
plt.imshow(test_img.reshape(200, 200), 'gray')
plt.axis('off')
plt.subplot(1,2,2)
plt.stem(predict[0], use_line_collection = True)
plt.show()

print('Prediction : {}'.format(name[mypred[0]]))
print('Ground Truth : {}'.format(name[GT[0]]))
Prediction : inclusion
Ground Truth : inclusion